Genomes, genomes everywhere - but where to browse?

Author

  • Lisa J. Mullan
Abstract

The first unicellular eukaryote to be fully sequenced was Saccharomyces cerevisiae in 1996 [1], and two years later Caenorhabditis elegans became the first multicellular eukaryote with a sequenced genome [2]. Subsequently there has been an explosion in the number of genome projects. Potentially the most useful in medical terms, the Human Genome Project, carried out in 16 centres across the world, announced completion of the working draft sequence in 2001 [3,4]. Over 85 per cent of the human genome had been accurately deciphered and a total of 97 per cent of it had been sequenced at least five times. The finished sequence was announced in 2003: a highly accurate, highly contiguous sequence with fewer than one error per 10,000 bases. The only remaining gaps correspond to regions whose sequence cannot be reliably resolved with current technology.

The full-length human genomic sequence (known as a build) is created automatically from curated Tiling Path Files, which indicate clone order and overlap, together with the location of gaps, for each individual chromosome. These files are provided by the Human Sequencing Consortium and are used together with high-throughput genomic (HTG) sequence reads to complete a map of the entire genome as an ordered set of contigs (known as a scaffold) that places all bases in the correct order across the genome. To populate the browsers with the same scaffold data and provide a frame of reference for researchers, each genome project is frozen at a particular stage in finishing the sequence. This freeze is the build that is used as the framework onto which each browser hangs its annotation.

Other genomes are built using a similar method. Because the process is automated and involves the construction of a region spanning billions of bases, errors can occur, and an error results in the absence of a specific region of the genome. This happens to well-known regions as well as novel ones, and will be fixed if the relevant consortium is alerted. As the build is currently dynamic, the remaining gaps are constantly being filled and chromosomal coordinates may change with each assembly.

The wealth of data accumulated from this myriad of genome sequencing projects around the world has opened the door to further understanding of the mechanisms by which biological functions are achieved. This global genome project is so large, however, that making the …
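
As a rough illustration of how clone order, overlap and gap information can fix chromosomal coordinates, and why those coordinates may shift between assemblies, the following Python sketch lays a made-up tiling path onto a chromosome. The `PathEntry` fields, the clone names and all sizes are hypothetical; they do not reflect the actual Tiling Path File format or any real build pipeline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PathEntry:
    """One row of a hypothetical, simplified tiling path (not the real file format)."""
    clone: str            # clone identifier, invented for illustration
    length: int           # clone length in bases
    overlap: int = 0      # bases shared with the previous clone
    gap_before: int = 0   # estimated unsequenced gap preceding this clone


def place_clones(path: List[PathEntry]) -> List[Tuple[str, int, int]]:
    """Return (clone, start, end) in 1-based, inclusive chromosome coordinates."""
    placed = []
    prev_end = 0
    for entry in path:
        if entry.overlap and entry.gap_before:
            raise ValueError(f"{entry.clone}: a clone cannot both overlap and follow a gap")
        # An overlap pulls the start back; a gap pushes it forward.
        start = prev_end + 1 + entry.gap_before - entry.overlap
        end = start + entry.length - 1
        placed.append((entry.clone, start, end))
        prev_end = end
    return placed


# Toy "build": three clones, one overlap, one gap of estimated size 100.
build_a = [
    PathEntry("cloneA", 1000),
    PathEntry("cloneB", 800, overlap=200),
    PathEntry("cloneC", 500, gap_before=100),
]
# A later toy "build" in which the gap estimate has been revised to 50 bases.
build_b = build_a[:2] + [PathEntry("cloneC", 500, gap_before=50)]

for name, build in (("build A", build_a), ("build B", build_b)):
    print(name, place_clones(build))
```

In this toy case the position of `cloneC` differs between the two builds only because the gap estimate changed, which is the sense in which annotation tied to one frozen build cannot be assumed to carry the same coordinates in the next.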

Similar resources

Profile of Eight Prophage Sequences Present in the Genomes of Different Acinetobacter baumannii Strains

Background and Objective: Prophage sequences are major contributors to interstrain variations within the same bacterial species. Acinetobacter baumannii is a gram-negative bacterium that causes a wide range of nosocomial infections, especially in intensive care unit inpatients. Prophage sequences constitute a considerable proporti...

Acquired Antimicrobial Resistance Genes of Escherichia coli Obtained from Nigeria: In silico Genome Analysis

Background: Antimicrobial resistance is a global problem with enormous public health and economic impact. This study was carried out to get an overview of acquired antimicrobial resistance gene sequences in the genomes of Escherichia coli isolated from different food sources and the environment in Nigeria. Methods: To determine the acquired antimicrobial-resistant genes prevalence, genome asse...

Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes

Growing amount of information on biological sequences has made application of statistical approaches necessary for modeling and estimation of their functions. In this paper, sensitivity and specificity of the first and second Markov chains for prediction of genes was evaluated using the complete double stranded DNA virus. There were two approaches for prediction of each Markov Model parameter,...

iMITEdb: the genome-wide landscape of miniature inverted-repeat transposable elements in insects

Miniature inverted-repeat transposable elements (MITEs) have attracted much attention due to their widespread occurrence and high copy numbers in eukaryotic genomes. However, the systematic knowledge about MITEs in insects and other animals is still lacking. In this study, we identified 6012 MITE families from 98 insect species genomes. Comparison of these MITEs with known MITEs in the NCBI non...

Species Specific DNA Profiling Mycobacterial Genomes Using Polymerase Chain Reaction with Single Universal Primer (UP-PCR)

Three tuberculous, twenty-one non-tuberculous mycobacterial (NTM) reference strains and seventy-two isolates classified by biochemical tests were shown to produce specific sets of DNA fragments in a polymerase chain reaction with single universal primer (UP-PCR). A rather wide limit of tolerance for variations in procedure of PCR mixture preparation and thermocycling parameters was found. There...

Journal:
  • Briefings in Bioinformatics

Volume 5, Issue 4

Pages: -

Publication date: 2004